Introduction
Knowing the required bandwidth for optimal operation of each codec is essential in calculating the total network capacity needed for deploying Sametime audio and video. The total network capacity depends on two factors: bandwidth needed for each user, and usage concurrency rate.
Each communication session carries either audio only or audio and video, so the network bandwidth required for a user is essentially the total bitrate of the codecs used in the session. Audio codecs generally transmit at much lower bitrates than video codecs because of the complex nature of video coding. To control network cost, organizations may choose to roll out Sametime audio/video capabilities in phases, starting with audio only.
IBM Sametime provides 5 audio codecs (G.722.1, G.729, G.711, iSAC, and iLBC) and 2 video codecs (H.264 and H.263). Each codec requires a different amount of network bandwidth to operate. Within a video codec, many attributes affect the data payload size and bitrate. Video resolution is one example: HD resolutions require more bandwidth than SD resolutions.
Out of the box, Sametime prefers iSAC for audio and H.264 at CIF resolution for video during SIP session negotiation between two Sametime endpoints (client to client, or client to the Packet Switcher). However, in an environment integrated with an external audio/video bridge, endpoints may select an audio codec other than iSAC and a video setting other than H.264 at CIF resolution to establish the call. This flexibility affects bandwidth, and it can be configured and controlled from the external bridge.
The other factor influencing total network bandwidth usage is the concurrency rate. Estimating it requires an educated guess based on what the organization already knows, such as planning assumptions, usage culture or patterns with similar technology, or pilot programs. If the estimate is too high, bandwidth will sit unused, which can be costly and wasteful. Conversely, if the estimate is too low, audio and/or video quality may be unacceptable, and other networked applications may suffer because bandwidth capacity is exceeded.
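As a rough illustration of how the two factors combine, the sketch below multiplies an assumed number of concurrent users by a per-user bandwidth figure. The function name, the user count, the 10% concurrency rate, and the 500 kbps per-user figure are all hypothetical values chosen for illustration; the text does not prescribe this exact formula, and the per-call formulas in the later sections give a more precise per-user figure.

```python
def estimate_total_bandwidth_kbps(total_users, concurrency_rate, per_user_kbps):
    """Rough capacity estimate: concurrent users times per-user bandwidth.

    All inputs are planner-supplied assumptions, not Sametime defaults.
    """
    if not 0.0 <= concurrency_rate <= 1.0:
        raise ValueError("concurrency_rate must be between 0 and 1")
    concurrent_users = total_users * concurrency_rate
    return concurrent_users * per_user_kbps


# Hypothetical organization: 2,000 users, 10% in a call at peak,
# each call leg using about 500 kbps for audio plus video.
print(estimate_total_bandwidth_kbps(2000, 0.10, 500))
```

Running the same function with a low, expected, and high concurrency rate is one simple way to see how sensitive the total is to the estimate.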
Lotus Sametime provides capabilities to protect the network from being overrun by audio and video data packets when usage concurrency is higher than expected and was not planned for. When these capabilities are deployed, each audio and video call is monitored to control bandwidth usage based on class of users and location policies. The call may be allowed, rejected, or modified to stay within the network bandwidth constraint imposed for audio and video.
The following sections describe Sametime codec usage and network bandwidth management in detail.
Audio Codecs
Most audio codecs operate at a fixed bitrate, as shown in Table 1 below. The exception is Sametime iSAC, which operates at transmission rates from about 10 kbps to about 32 kbps (see http://tools.ietf.org/html/draft-legrand-rtp-isac-02 for more detail).
Table 1: Sametime audio codecs, bitrates and sampling rates
Codec Name | Bitrate (kbps) | Sampling Rate (kHz) |
G.722.1 | 16/24/32 | 16 |
G.729 (only used in SUT) | 8 | 8 |
G.711 | 64 | 8 |
iLBC | 13.33/15.2 | 8 |
iSAC | 10 to 32 | 16 |
Sametime uses audio channels differently in point-to-point calls vs multi-point calls. In a point-to-point call, as illustrated in Figure 1, audio data is sent directly between the 2 endpoints in the call. There is 1 sending and 1 receiving audio channel, so the transmission rate is the bitrate of the audio codec.
Figure 1: Point-to-point call, audio data is exchanged directly between 2 endpoints
Typically a 20% packet overhead is added to the data rate to calculate the required network bandwidth.
(1) Bandwidth Ba = (codec bitrate * 20%) + codec bitrate
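Formula (1) can be sketched in a few lines. The function and constant names below are illustrative, the bitrates come from Table 1, and using the upper bound of the iSAC range is an assumption made to get a conservative figure.

```python
PACKET_OVERHEAD = 0.20  # 20% packet overhead, as stated in the text


def audio_bandwidth_kbps(codec_bitrate_kbps, overhead=PACKET_OVERHEAD):
    """Formula (1): Ba = (codec bitrate * 20%) + codec bitrate."""
    return codec_bitrate_kbps * overhead + codec_bitrate_kbps


# Bitrates from Table 1; for iSAC the upper bound of its range is used.
for name, bitrate in [("G.711", 64), ("G.729", 8), ("iSAC (max)", 32)]:
    print(f"{name}: {audio_bandwidth_kbps(bitrate):.1f} kbps")
```

For G.711 at 64 kbps, this gives 76.8 kbps per direction of a point-to-point call.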
In a multi-point call, audio data is sent from the participating endpoints to the Media Manager, which relays the audio channels (each participant is a channel) back to the participants according to the administrative setting of the configuration property Number of switched audio streams (2-16) on SSC. The default value is 5, which means the Media Manager sends a maximum of 5 audio channels to each participant, even if more than 5 participants are speaking at once. The Sametime client mixes the audio channels locally and plays out the audio.
The Media Manager trades network bandwidth for CPU usage: it can handle more participants because it does not process audio on the server and instead lets each client mix the audio channels locally for playback. This tradeoff is considered practical because, especially in large meetings, most participants are on mute except the presenter, so there is usually only 1 audio channel to process.
One issue worth mentioning is that even when a participant is not speaking, a noisy microphone or a bad sound card may send audio data to the Media Manager and consume bandwidth. It is therefore strongly recommended to use a good headset with noise-canceling circuitry or to stay on mute when not speaking.
As depicted in Figure 2, U1, U2, and U4 are on mute, so their endpoints are not sending audio data to the Media Manager. U3 is speaking, so U3's audio data is sent to the Media Manager, which relays it to all other participants.
Figure 2: Multi-point call, Media Manager relays audio channels to participants
Therefore, the network bandwidth for an audio-only multi-point call in the worst case is
(2) MBa = 5 * Ba * (Number of participants - 1), where Ba is defined in (1).
Note that (2) uses the worst case rather than the average to ensure ample bandwidth for audio data sent from the server. If the administrator changes the maximum number of audio channels on SSC, the factor of 5 in (2) should be changed accordingly.
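A minimal sketch of formula (2) follows, with the number of switched audio streams exposed as a parameter so the calculation tracks the SSC setting. The function and parameter names are illustrative, not part of Sametime.

```python
def multipoint_audio_bandwidth_kbps(codec_bitrate_kbps, participants,
                                    switched_streams=5, overhead=0.20):
    """Formula (2): MBa = switched_streams * Ba * (participants - 1)."""
    if not 2 <= switched_streams <= 16:
        raise ValueError("switched_streams must be in the range 2-16")
    if participants < 2:
        raise ValueError("a call needs at least 2 participants")
    ba = codec_bitrate_kbps * overhead + codec_bitrate_kbps  # formula (1)
    return switched_streams * ba * (participants - 1)


# Worst case for a 10-participant call using iSAC at its 32 kbps upper bound.
print(f"{multipoint_audio_bandwidth_kbps(32, 10):.1f} kbps")
```

With the default of 5 switched streams, the 10-participant iSAC example works out to 5 * 38.4 * 9 = 1728 kbps.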
Video Codecs
Bandwidth for video codecs differs greatly from that for audio codecs because many factors influence how the data is encoded. H.264 has many different profiles, or sets of capabilities; Sametime supports the Baseline Profile or Constrained Baseline Profile, which are typically used in video conferencing and mobile applications.
The video encoder operates between a minimum and a maximum bitrate, encoding the data based on the activity in front of the camera and on feedback from the far side. In Sametime, the maximum bitrate is set by the administrator as part of the user policy. Some groups of users may have a different video policy than others. The video policy includes the resolution, maximum frame rate, and maximum bitrate, as shown in Figure 3.
Figure 3: Video specification in user policy on SSC
The video policy dictates the constraints that a Sametime video endpoint must operate within. For example, the specification above indicates that a user assigned this policy can use video at CIF (352x288) resolution, at a maximum of 15 frames per second and a maximum of 384 kbps.
The administrator may also create a custom video policy rather than using 1 of the predefined ones. The custom policy may be necessary to support certain network conditions and inter-operate with external endpoint devices.
There are many predefined video policies available on SSC for selection; some typical ones are shown in Table 2.
Table 2: H.264 codec resolution definitions
Description | Resolution | Frame Rate (fps) | Bitrate (kbps) |
QCIF 176x144@15fps 128kbps | 176x144 | 15 | 32/64/128 |
CIF 352x288@15fps 384kbps | 352x288 | 15 | 128/256/384 |
VGA 640x480@30fps 512kbps | 640x480 | 30 | 192/384/512 |
HD-720p 1280x720@30fps 768kbps | 1280x720 | 30 | 256/512/768 |
The exact network bandwidth usage for video cannot be estimated. The best approach is to base the calculation on the maximum bitrate set in the policy. However, if different groups of users within an organization have different policies, the calculation should use the mean of the maximum bitrates, weighted over the user population.
The Media Manager treats video streams quite differently from audio streams. For a point-to-point call, as with audio, the video stream is sent directly between the 2 participating endpoints. In multi-point calls, however, the Media Manager uses the Voice Activated Switching method to distribute the video streams. That means that at any given point, only the video stream of the most active speaker is sent to all participants. For efficiency, the Media Manager notifies the other client endpoints not to send their video streams to the server. When a user is on mute or selects Pause My Video from the UI, no video is transmitted to the server.
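One way to read "mean over the user population" is a mean of the policy maximum bitrates weighted by the number of users on each policy. The sketch below takes that reading; the function name and the 700/300 user split are hypothetical, while the 384 kbps and 768 kbps values are the CIF and HD-720p maximums from Table 2.

```python
def mean_max_bitrate_kbps(policy_groups):
    """Population-weighted mean of policy maximum bitrates.

    policy_groups: iterable of (user_count, max_bitrate_kbps) pairs.
    """
    groups = list(policy_groups)
    total_users = sum(count for count, _ in groups)
    if total_users == 0:
        raise ValueError("no users in any policy group")
    return sum(count * bitrate for count, bitrate in groups) / total_users


# Hypothetical split: 700 users on the CIF policy (384 kbps),
# 300 users on the HD-720p policy (768 kbps).
print(f"{mean_max_bitrate_kbps([(700, 384), (300, 768)]):.1f} kbps")
```

The result can then stand in for the single policy maximum bitrate wherever one representative figure is needed.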
Therefore, network bandwidth required for point-to-point video is
(3) Bv = (video policy max bitrate * 20%) + video policy max bitrate
and multipoint video is
(4) MBv = Bv * (Number of participants), where Bv is defined in (3)
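Formulas (3) and (4) can be sketched as follows. The function names are illustrative, and the example uses the CIF policy maximum of 384 kbps from Table 2.

```python
def video_bandwidth_kbps(policy_max_bitrate_kbps, overhead=0.20):
    """Formula (3): Bv = (policy max bitrate * 20%) + policy max bitrate."""
    return policy_max_bitrate_kbps * overhead + policy_max_bitrate_kbps


def multipoint_video_bandwidth_kbps(policy_max_bitrate_kbps, participants):
    """Formula (4): MBv = Bv * number of participants."""
    return video_bandwidth_kbps(policy_max_bitrate_kbps) * participants


# CIF policy (384 kbps maximum): point-to-point, then a 10-participant call.
print(f"{video_bandwidth_kbps(384):.1f} kbps")
print(f"{multipoint_video_bandwidth_kbps(384, 10):.1f} kbps")
```

For the CIF policy this gives 460.8 kbps point-to-point and 4608 kbps for a 10-participant multipoint call.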
Bandwidth Management
Because the estimated concurrent call rate might not hold up in reality, or because available bandwidth has a known limit, audio and video data rates should be moderated to protect the network for other business-critical applications and to provide enough bandwidth for acceptable voice and visual quality.
Sametime uses SIP to negotiate media sessions. Embedded in the SIP message is an SDP (Session Description Protocol, RFC 4566) section containing the desired session bandwidth attribute, which the Bandwidth Manager uses to monitor transmission rates on the managed network.
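In RFC 4566, bandwidth is expressed on `b=<bwtype>:<bandwidth>` lines, where the `AS` type is given in kilobits per second. The sketch below extracts those lines from an SDP body. It assumes that the bandwidth attribute mentioned above is carried this way; the sample SDP is invented for illustration and is not captured from a Sametime client, and the function name is hypothetical.

```python
def parse_sdp_bandwidth(sdp_text):
    """Return (scope, bwtype, value) tuples for every b= line in an SDP body.

    scope is "session" before the first m= line, otherwise the media type
    ("audio", "video", ...) of the enclosing m= section.
    """
    results = []
    scope = "session"
    for raw_line in sdp_text.splitlines():
        line = raw_line.strip()
        if line.startswith("m="):
            scope = line[2:].split()[0]
        elif line.startswith("b="):
            bwtype, _, value = line[2:].partition(":")
            results.append((scope, bwtype, int(value)))
    return results


# Invented sample: one audio and one video section with AS bandwidth lines.
SAMPLE_SDP = """v=0
o=- 0 0 IN IP4 192.0.2.10
s=-
c=IN IP4 192.0.2.10
t=0 0
m=audio 49170 RTP/AVP 0
b=AS:64
m=video 51372 RTP/AVP 96
b=AS:384
a=rtpmap:96 H264/90000
"""

print(parse_sdp_bandwidth(SAMPLE_SDP))
```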
As illustrated in Figure 4 below, the Bandwidth Manager, when deployed, is part of the signalling path and performs CAC (Call Admission Control) based on the available bandwidth.
Figure 4: Bandwidth Management as part of SIP signalling
Depending on user policy, the locations of the call, and available bandwidth, the Bandwidth Manager may let the call through, reject the call, or modify the media or the bandwidth attribute in the SDP. This action ensures that the total transmission rate for audio and video does not exceed the bandwidth allocated for audio and video usage in the system configuration.
Calls are recorded with details such as call locations and required bandwidth. Organizations may use this information to measure audio and video usage and its share of network capacity for future planning. The impact that deploying audio and video has on the network can be calculated from the data captured by the Bandwidth Manager.
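The three outcomes can be pictured with a simplified decision function. This is a hypothetical sketch, not the Bandwidth Manager's actual algorithm: it ignores user policy and location, and the function name, the `minimum_kbps` threshold, and the rule of granting whatever bandwidth remains are assumptions made for illustration.

```python
def admit_call(requested_kbps, available_kbps, minimum_kbps):
    """Return (decision, granted_kbps) for one call request.

    decision is "allow", "modify", or "reject". Hypothetical logic only.
    """
    if requested_kbps <= available_kbps:
        return ("allow", requested_kbps)      # request fits as-is
    if available_kbps >= minimum_kbps:
        return ("modify", available_kbps)     # lower the requested bandwidth
    return ("reject", 0)                      # not enough for acceptable quality


# A 384 kbps request against three different amounts of remaining bandwidth.
for available in (1000, 200, 50):
    print(available, admit_call(384, available, minimum_kbps=128))
```

In this sketch a request is downgraded only when the remaining bandwidth still meets the assumed minimum for acceptable quality; otherwise it is rejected outright.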
Summary
Audio and video codecs differ in bandwidth usage because of how the Media Manager processes the data in Sametime 8.5.2. The required network bandwidth for an organization should be calculated with the formulas given in (2) and (4), as part of capacity planning to provide the best possible network conditions for audio and video. Organizations should consider deploying the Bandwidth Manager to protect the network and to ensure quality audio and video calls. Data captured by the Bandwidth Manager enables an organization to plan for future capacity.